Lexical Sets: Relevance and Probability
Abstract
Preliminary findings from corpus analysis suggest that the semantics of each verb in the language are determined by the totality of its complementation patterns. Accurate description of those patterns requires a level of analytic delicacy which was not possible until the advent of large bodies of data, along with techniques for distinguishing significant patterns from mere noise. Such analysis is in its infancy, but it is already clear that, in order to analyse the semantics of verbs empirically, it is necessary to identify typical subjects, objects, and adverbials and to group individual lexical items into sets within those clause roles. The nature of lexical sets is discussed and an attempt is made to identify the range of semantic and syntactic phenomena encountered in verb analysis, in both monolingual and bilingual contexts.

Verbs and Clause Roles

Let me start by inviting you to participate in an exercise which proves popular and instructive with students who have reasonably sophisticated intuitions about English. Exercises of this sort are comparatively easy to construct nowadays, given the availability of corpus data and concordancing programs such as MICROCONCORD. The exercise is to identify the one word (there is only one) which fits all 30 of the concordance lines in Figure 1, which are selected from the 300-odd citations for this verb in the Oxford ‘Hector’ Pilot Corpus (a pilot study of 18 million words, compiled in 1990-92, for what is now the British National Corpus). The British National Corpus itself, completed in 1995, contains 3888 citations for the verb in question and a further 702 noun uses. What is the missing word? A typical seminar discussion usually goes something like this:

LINES | CANDIDATES | COMMENTS
1-4 | ‘said’, ‘asked’ |
5 | NOT ‘said’ | ‘said’ does not take person as direct object.
6-8 | seems to confirm ‘asked’; ?‘requested’, ??‘proposed’ | can’t be ‘proposed’ because it doesn’t fit 5.
10-13 | ?‘told’ | ‘told’ doesn’t fit 1-3.
17-18 | | ‘ask’ is beginning to look too weak.
24 | | Rules out ‘asked’. You can ask a question, but you can’t ‘ask caution’. (You can, of course, ‘ask for caution’.) Is it ‘requested’? But that’s even less OK for 17-18.
25-26 | | By this point most classes will have suggested ‘urged’.
28 | | Note the Ford Sierra as a metaphorical steed.
29-30 | |

What can we learn from such an exercise? First, it appears that the set of normal complementations for this verb is unique. If it is deleted from a text, two or three candidates (perhaps more) may fit the slot that it leaves. But cumulatively the complementations add up to a unique set of patterns, ruling out all other candidates. The same seems to be true of all (or almost all) verbs in English. The implications of this phenomenon have not been fully explored. However, one such implication is that the semantics of the verb are determined by the totality of its complementation patterns. So, for example, the systemic choice by the utterer of urged in lines 1 and 2 (in preference to, e.g., ‘said’ or ‘asked’ or ‘told her’ or ‘proposed’) is in part caused by the utterer’s subconscious knowledge that urge is associated both with riders, horses, and forward movement (as in lines 26 and 27) and with petitioners, governments, and positive actions (as in lines 15 and 16). This is consistent with the Firthian programme of “knowing a word by the company it keeps” (Firth 1957).

Secondly, while broad subcategorizations such as ‘transitive’ and ‘intransitive’ are a helpful first step towards a lexicosyntactic analysis, more delicate subcategorizations are necessary for the proper understanding of a lexical item: we noted, for example, that the phraseology ‘ask caution’ is not conventional in modern English (though by no means ungrammatical). There are 25 matches for ‘urge caution’ in the British National Corpus, but none for ‘ask caution’. It is not, of course, the transitivity that is in question here.
Transitive uses such as ‘ask a question’ and ‘ask a person a question’ are quite normal. The problem lies in selecting the particular noun caution as the direct object of asked. In the literature, such subcategorizational phenomena are often referred to as selectional restrictions, but it would be more accurate to call them selectional preferences. A restriction prevents or forbids you from doing something; but it is often the case that locutions excluded by a selectional preference are nevertheless perfectly grammatical, psychologically acceptable, and communicatively adequate. They are just not conventional. They deviate from an established norm. But what is this ‘established norm’? You will not find it described fully in any published work. Indeed, because of the flexible, variable nature of the lexicon, it may be a fool’s errand to even attempt a full description. Nevertheless, it is probably useful to persevere and to try to encapsulate the invariant core that lies at the heart of this rather variable phenomenon, the conventional use of an English word. Before attempting to attach meanings or definitions to a word, we must first account for the various syntactic and collocational patterns in which the word participates. Most monolingual dictionaries list senses without making any serious attempt to say how one sense is to be distinguished from another. Some bilingual dictionaries, as we shall see, do at least make an attempt to address the problem, indicating informally conditions for choosing one foreign-language equivalent rather than another. These, however, do not always systematically cover the main usages. A more explicit account is needed. Figure 2 is a behavioural profile of the verb urge: an attempt to encapsulate its established norms (patterns of usage) on the basis of observation of a body of evidence of actual usage (a corpus). 
Particular meanings (strictly speaking, meaning potentials: see Hanks 1994) or particular translations can be associated with each of these patterns of usage. Only when we have described adequately the different lexicosyntactic patterns in which the word participates can we decide how to attach definitions or translations to each of them, decide whether to lump them together or split them still further, and so on.

How was this summary arrived at? The corpus was searched for all occurrences of the lemma urge. The matches were then painstakingly classified according to the contexts in which they occur. The most common pattern (a person urging another person to do something) accounted for 61% of the uses, while an older sense (a person urging a steed onwards) accounted for only 3.5%. Such imbalance is typical of the distribution patterns of most polysemous words, although standard dictionaries give no hint of it, giving equal weight to all senses, even the rarest. Indeed, many modern dictionaries (British and American) give first place to comparatively rare senses of words insofar as these are the oldest or longest-surviving senses.

Let us now look at a couple of points of detail in Figure 2. Firstly, the percentages given do not sum to 100%. This is because about 10% of the uses are either what I call exploitations (metaphors, figurative uses, etc.) or simply unclassifiable on the available data. Thus, although the sentence “urging his Sierra through Grizedale Forest” is perfectly natural and interpretable, it would be wrong to classify a Ford Sierra as a member of the lexical set STEED. Canonically, for this sense of urge, psychological persuasion is implied. A rider who urges his horse over a jump tries to influence the animal’s psyche. But a motor car does not have a psyche for the driver to influence. So it is better to regard Sierra as an ‘honorary’ or ‘nonce’ member of the set of steeds.
Classifying it as a full member of the set would have disastrous consequences for the usability of such sets in lexical analysis: they would be hopelessly broad and all-inclusive. Some unifying intensional property must be sought, if these sets are to play a part in the decision procedures that help us distinguish one sense from another. However, rather than getting bogged down in stating the intensional criteria explicitly, it may be better to give the set a name (coined ad hoc as a mnemonic, and bringing with it no theoretical baggage), and define it extensionally, simply by listing all its members (that is, those judged by the analyst to be bona fide members) as found in the corpus. Thus, in the context of urge, the members of the set STEED found in the British National Corpus are ‘a horse’, ‘his horse’ (×8), ‘his large roan horse’, ‘his mount’ (×5), ‘the black stallion’, ‘his pony’ (×3), and various named horses (Chalon, Contralto, Fontana, Nero, and Violet). The Oxford Hector Pilot Corpus also contains ‘two stalling camels’, an example of the class of steeds that is not, alas, represented in the BNC.

Notice that the steed sense of urge is usually accompanied by an adverbial of direction: on, onwards, forward, along, into the shallow water, up the path, down the rutted lane, through the desert, and up the slope. By far the most common is urge on, which some people would classify as a phrasal verb. This small set of STEEDS with an adverbial of direction is closely paralleled by the set of people with an adverbial of direction: ‘urging practitioners towards greater involvement’, ‘urging on my more sluggardly companions’, ‘urged the Party on’, and so forth. This particular behavioural norm, then, consists of two features, composed of two probabilistic lexicosemantic sets and the probabilistic correlation between them.
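The extensional definition of STEED, and its correlation with adverbials of direction, can be illustrated with a minimal Python sketch. Everything named here (STEED_CITATIONS, NONCE_STEEDS, DIRECTION_ADVERBIALS, classify, member_shares) is hypothetical, and exact string matching merely stands in for the analyst's judgement. The counts are those reported above; one citation per named horse is an assumption, since the text lists only the names.

```python
from collections import Counter

# Extensional definition of STEED: phrases attested as objects of "urge" in
# the BNC, with the counts given in the text. One citation per named horse
# is an assumption; the text only lists the names.
STEED_CITATIONS = Counter({
    "a horse": 1,
    "his horse": 8,
    "his large roan horse": 1,
    "his mount": 5,
    "the black stallion": 1,
    "his pony": 3,
    "Chalon": 1, "Contralto": 1, "Fontana": 1, "Nero": 1, "Violet": 1,
})

# 'Honorary' or 'nonce' members are kept apart so the set itself stays narrow.
NONCE_STEEDS = {"his Sierra"}

# Adverbials of direction listed in the text for the steed sense.
DIRECTION_ADVERBIALS = {
    "on", "onwards", "forward", "along", "into the shallow water",
    "up the path", "down the rutted lane", "through the desert",
    "up the slope",
}


def classify(direct_object: str, adverbial: str) -> str:
    """Label one citation of 'urge' by its direct object and adverbial."""
    if direct_object in STEED_CITATIONS and adverbial in DIRECTION_ADVERBIALS:
        return "norm: urge [STEED] [DIRECTION]"
    if direct_object in NONCE_STEEDS:
        return "exploitation"
    return "unclassified"


def member_shares(citations: Counter) -> dict:
    """Relative frequency of each member within the set."""
    total = sum(citations.values())
    return {phrase: count / total for phrase, count in citations.items()}


if __name__ == "__main__":
    print(classify("Fontana", "up the path"))
    print(classify("his Sierra", "through Grizedale Forest"))
    print(round(member_shares(STEED_CITATIONS)["his horse"], 3))
```

Keeping nonce members in a separate set mirrors the argument above: the Sierra citation remains interpretable as an exploitation of the steed pattern without broadening STEED itself, and the relative frequencies give a rough sense of what "probabilistic" membership might mean in practice.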
The verb urge complemented by (1) a word denoting a horse as direct object plus (2) an adverbial of direction (“urged Fontana up the path”) is in systemic contrast with the verb urge complemented by (1) the name of a person or group as direct object plus (2) a to-infinitive (“urged Stella to blot out the memory”). The intermediate pattern, consisting of the verb urge complemented by (1) the name of a person or group as direct object plus (2) an adverbial of direction, is often somewhat metaphorical (e.g. “urging practitioners towards a greater involvement...”).

Collocations can be misleading to the unwary, and great caution in assigning patterns to interpretations is called for. For example, urge collocates with two quite different uses of on. On the one hand, we have people urging steeds on, where the particle is intransitive, but on the other hand there is a pattern exemplified as “to urge a course of action on someone”, where the particle is transitive and the interpretation is quite different.

If we sort out the conventional uses of words in the way suggested here and start by compiling a “dictionary without definitions”, we find that a tiny number of patterns (at most half a dozen) account for a very high proportion of all uses (70 or 80 percent, if not more). The remaining, less conventional uses generally fall into place in relation to one or more of the major patterns. Thus, the citation The communiqué urged prudence is related to the pattern “[PERSON] urge [ATTITUDE]” by virtue of the fact that communiqués are utterances by means of which persons express attitudes. This may seem like a painful way of restating the obvious. But we need to say precisely what the conventions of use are before we can say how they are used and exploited to create meanings. Other exploitations are more dramatic and complicated. Exploiting a word in a text may be, but is not necessarily, an esoteric creative activity.
Every use of a word is an exploitation of its meaning potential to create a meaning. Each of the lexicosyntactic patterns is associated with a meaning potential. What might these meaning potentials be like? Rather than cite traditional monolingual definitions, it might be better to think in terms of presupposition and implicature. In this connection, Anna Wierzbicka’s comments (1987) on the meaning of urge (Figure 3) are relevant. One conclusion we might draw from Wierzbicka’s comments is that a list of presuppositions and implications would be at least as interesting as a list of definitions. Another (which she herself would like us to draw) is that urge has only one basic sense, not five or six as shown in standard dictionaries. This is a controversial claim which we do not need to go into. The decision whether to lump or split senses may be no more than a matter of differences in the analysts’ tastes. With this by way of background, let us now turn to look briefly at what bilingual dictionaries do with this word, and then, even more briefly, contrast it with a couple of semantically related English words not discussed by Wierzbicka.

‘Urge’ in Bilingual Lexicography

In any standard dictionary (monolingual or bilingual), many words (especially verbs) are listed as having more than one sense; bilingual dictionaries offer more than one translation. How is the translator, especially the naive translator, to know which one is appropriate? Figure 4 shows the Oxford-Hachette entry for English urge; Figure 5 shows the entry for the same word in the Oxford-Duden English-German Dictionary. As translations for urge the Oxford-Duden English-German Dictionary offers variations on the basic theme of drängen, mahnen, and treiben. The Oxford-Hachette English-French Dictionary offers conseiller (vivement), préconiser, déconseiller, insister,
Publication date: 1997